Data mining system, method and apparatus for industrial applications

ABSTRACT

A data mining system directed to industrial applications gathers data remotely from a plurality of industrial plants or sites and makes the gathered data available over a network to a user running a data mining application or on-line analytical process means to generate and, if desired, visualize results. Access to a service of this nature may be made available in connection with an on-line service, such as a website, that also provides access to industrial products and services. Subscribers to the on-line service, who may be charged a fee to subscribe, may be given an incentive to continue their subscription in the form of an offer of free or reduced-cost access to the above-described data mining system during their on-line service sessions.

[0001] This application is a continuation of application Ser. No. 10/068,614 filed on Feb. 6, 2002, which claims the benefit, for purposes of priority under 35 U.S.C. §119(e), of U.S. Provisional Patent Application No. 60/266,640, filed Feb. 6, 2001.

FIELD OF THE INVENTION

[0002] The invention relates to data mining and, more particularly, to a system, method and apparatus for mining data and providing services related thereto using a communication system, such as the Internet, in industrial applications.

BACKGROUND

[0003] Data mining explores detailed business transactions to uncover patterns and relationships contained within a particular business activity and history. Data mining can be done manually by “slicing and dicing” the data until a pattern becomes obvious. A term of art in the field, “slicing and dicing” refers to the ability to move between different dimensions of warehoused data. Data mining can also be performed with programs that analyze the data automatically. Using computer technology to look for hidden patterns in a collection of data, data mining for marketing research, for example, might reveal that customers interested in one product will also be interested in another. In other areas, data mining can be useful in scientific research, economics, criminology, and many other fields. In general, there exists specialized database software for data mining.

[0004] A data warehouse is a database designed to support decision making in an organization. It is batch updated and can contain enormous amounts of data. Hence, the moniker “warehouse”. For example, large retail organizations can have 100 GB or more of transaction history. When the database is organized for one department or function, it is often called a “data mart” rather than a data warehouse.

[0005] The data in a data warehouse is typically historical and static and may also contain numerous summaries. It is structured to support a variety of analyses, including elaborate queries on large amounts of data that can require extensive searching. When databases are set up for queries on daily transactions, they are often known as operational data stores (ODSs) rather than data warehouses.

[0006] At present, there exist no data mining services relating to industrial applications that provide data mining and data mart services as described above. Until now, data services for industrial applications have been offered by consultants who manually compile and audit data on a case-by-case basis. Such manual compilation may be inadequate to provide a complete or accurate analysis of the industry being researched. Manual accounting methods, moreover, cannot provide a continuous or virtually continuous update of the industry and are subject to human error.

SUMMARY

[0007] The long felt, but unmet, needs described above are addressed by various aspects of the system and method according to the present invention. One aspect of the present invention, for example, provides a method for retaining a client subscription to an on-line site that provides access to industrial products and services, including data mining services in connection with a data warehouse populated with data relating to at least one industrial application. According to the method, the on-line site provides access at least to industrial products and services. A subscription is offered to the client to access the on-line site. If the client accepts the subscription offer, the client is further offered at least some access to the data mining service free of charge, though is otherwise charged a fee for subscribing to the on-line site. The client subscription is thereby retained by rewarding the client with the data mining analyses service access free of charge.

[0008] In another aspect of the present invention, a method in a data mining system provides analysis services directed to industrial control-related data originating at a plurality of client industrial systems. The data mining system is in communication with the plurality of client industrial systems, is accessible over a network by a user, is associated with a data warehouse, and comprises at least one data mining application. The method comprises several steps. Industrial control-related data is collected from the plurality of client industrial systems. The collected industrial control-related data is stored in the data warehouse associated with the data mining system. User access over the network to the at least one data mining application is provided. In response to the user-directed data mining application, data is retrieved from the data warehouse and processed. Processed data is then delivered over the network to the user.

[0009] In yet another aspect of the present invention, a method is provided for delivering industrial control-related on-line services to a user during an on-line session. The method may comprise the following steps. The user is provided with access over a network to a data mining system during the on-line session. The data mining system comprises at least one application in communication with a data warehouse. The data warehouse comprises data collected from among network-delivered data originating with a plurality of industrial control systems. The application allows the user to conduct analyses of data in the data warehouse and to view the results of the analyses. The user is also provided, during the same on-line session, with access to non-data mining industrial control-related content.

[0010] In a further aspect of the present invention, a method is provided for generating a data structure comprising industrial control-related content for presentation to a user over a network during an on-line session. A document is generated for presentation over the network to the user during the on-line session. A first link is inserted into the document that, when selected by the user, points to a second document relating to an industrial control-related data mining service. A second link is inserted into the document, such that, when selected by the user, it points to non-data mining industrial control-related on-line content. The user is thereby provided access to the industrial control-related data mining service as an incentive to also access the non-data mining industrial control-related on-line content.

[0011] In still another aspect of the present invention, a data mining system provides analysis services directed to industrial control-related data originating at a plurality of client industrial systems. The data mining system is in communication with the plurality of client industrial systems and is further accessible over a network by a user. The data mining system is associated with a data warehouse and comprises at least one data mining application. The system comprises data collection means adapted for collecting data from the plurality of client industrial systems, a data warehouse coupled to the data collection means and adapted for storing data collected from the plurality of industrial systems, on-line analytical processing means coupled to the data warehouse and adapted for analyzing industrial systems data, and user-interface means for presenting to the user the results of on-line analytical processing.

[0012] Various other aspects of the system and method according to the present invention are illustrated, without limitation, in the figures, the description below, and the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0013] FIG. 1 is a diagram of an embodiment of a system according to the present invention.

[0014] FIG. 2 is a flow diagram of an embodiment of a method according to the present invention.

[0015] FIG. 3 is a diagram of an embodiment of a system according to the present invention, illustrating real-time analysis.

[0016] FIG. 4 is a diagram of an embodiment of a business model according to the present invention.

DETAILED DESCRIPTION

[0017] FIG. 1 shows a system diagram 100. One or more remote sites 102, in this instance industrial plants or sites 102 a-102 c, are coupled to an on-line communication network 104, such as the Ethernet, Internet, Intranet, local area network or the equivalent. The connection between the remote sites 102 and the communication network 104 is accomplished in accordance with any of the known protocols including, but not limited to, TCP/IP, ModBus, etc. Information or data is collected by a collection mechanism 106. In one aspect, the collection performed by the collection mechanism 106 is carried out automatically by, for example, software. This may be implemented using any of the known scripting languages, such as HTML or Java.

[0018] The data collected by the collection mechanism 106 is stored in data warehouse 108. On-Line Analytical Processing (OLAP) 110, sometimes referred to as “multi-dimensional analysis,” analyzes the data stored in the data warehouse 108 according to a predetermined analysis routine. Such routine may be implemented automatically by software. A report 112 is generated according to predetermined processing. In one aspect, the processing, which may be software driven, provides a user-specific report 112 a in a predetermined format, such as a best practices chart 112 b. As illustrated, the best practices chart may plot the efficiency of a particular piece of equipment, such as an industrial machine. As shown in FIG. 1, the best practices chart plots efficiency versus the firing rate of the equipment.

[0019] In one implementation, the remote sites 102 are the clients, i.e., users. A client may be an industrial manufacturer that may wish to measure the efficiency of its plant or its equipment, for example, a boiler. Data collected from boilers at remote sites 102 a-c are uploaded through the communication network 104. The information is collected by collection mechanism 106 and stored in data warehouse 108. The data stored in the data warehouse is data to be mined by the OLAP 110. In one aspect, the OLAP 110 is designed according to information entered on the specific type of equipment and model, boilers in this example, regarding the efficiency of such equipment.

[0020] The data is sliced into data marts (or markets) according to predetermined parameters set either by the user or the OLAP 110 software programmers, i.e., the service provider. For example, the data may be sliced into machines of a particular type, such as boilers. Such a cross-section, from equipment having a common characteristic, provides the user with superior data points with which to compare the user's own equipment. An example of such a common characteristic might be the external temperature of the equipment. This would populate the data market with information on equipment in similar temperature environments to determine, for example, the efficiency of equipment when this external factor puts a load on the equipment. External temperature may be obtained using isothermal maps available off the Internet. Other common factors, such as the age of equipment, geographic location and industry usage may alone, or in combination, be factors taken into account in accordance with the present invention.
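The slicing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`equipment_id`, `type`, `ambient_temp_c`, `efficiency`), the 10-degree temperature bands and the function names are hypothetical and do not come from the specification.

```python
from collections import defaultdict

# Hypothetical warehouse records; all field names and values are illustrative.
RECORDS = [
    {"equipment_id": "B-1", "type": "boiler", "ambient_temp_c": 4.0, "efficiency": 0.81},
    {"equipment_id": "B-2", "type": "boiler", "ambient_temp_c": 6.5, "efficiency": 0.78},
    {"equipment_id": "B-3", "type": "boiler", "ambient_temp_c": 27.0, "efficiency": 0.85},
    {"equipment_id": "C-1", "type": "chiller", "ambient_temp_c": 5.0, "efficiency": 0.60},
]


def temperature_band(temp_c, width=10.0):
    """Return the lower edge of the temperature band that contains temp_c."""
    return (temp_c // width) * width


def slice_into_marts(records, band_width=10.0):
    """Group records into data marts keyed by (equipment type, temperature band)."""
    marts = defaultdict(list)
    for rec in records:
        key = (rec["type"], temperature_band(rec["ambient_temp_c"], band_width))
        marts[key].append(rec)
    return dict(marts)


marts = slice_into_marts(RECORDS)
cold_boilers = marts[("boiler", 0.0)]  # boilers operating between 0 and 10 degrees C
```

Other common factors named in the text, such as equipment age or geographic location, could be added to the grouping key in the same way.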

[0021] The data market may comprise data with any predetermined characteristic including data collected from equipment having dissimilar features or used in disparate contexts. For example, the data may come from boilers used in a warehouse, a school or a food processing plant. The user may use the data market to ascertain a cross-section of information regarding the efficiency of boilers more effectively than with the existing method of manually compiling information provided by consultants.

[0022] With reference to the flow diagram shown in FIG. 2, the remote sites 202 (102, in FIG. 1), may be one or more locations (e.g., industrial operations) (202 a . . . 202 n). The locations are in communication with a central database, via a communication network 204. Scanner 206 periodically or continuously connects to one or more of the remote sites to collect data. The scanner may do this either automatically or at the prompting of the service provider. In addition, the user may upload information at their discretion. In the boiler example, the scanner may download fuel consumption data or steam output information, for example. Of course, the scanner may download any pre-selected information that is selected for populating the data warehouse.

[0023] A Data Transformation Service (DTS) 208 mechanism scrubs the data that has been scanned and uploaded by the scanner 206. Data scrubbing is a well known technique that filters incoming data so that unnecessary data is removed and does not waste data warehouse space. The DTS, for example, may remove the internet routing address or other data that may not be useful for determining the efficiency of the boiler in the above example. The scrubbed data is then stored in data warehouse (D/W) 210. The Online Analytical Processing (OLAP) mechanism 212 analyzes and processes the data in the warehouse 210. The data may be sorted and redistributed into smaller data warehouses called data markets, or marts. These data marts may include data representative of specific characteristics of the piece of equipment under consideration.
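As a minimal sketch of the scrubbing step, the following Python keeps only the fields assumed to be useful for the boiler efficiency analysis and discards the rest, such as a routing address. The field names and the sample reading are hypothetical; the specification does not define a record layout.

```python
# Fields assumed to matter for the boiler efficiency analysis (hypothetical names).
KEEP_FIELDS = {"equipment_id", "timestamp", "fuel_used", "steam_produced"}


def scrub(raw_reading):
    """Return a copy of the reading containing only the fields in KEEP_FIELDS."""
    return {key: value for key, value in raw_reading.items() if key in KEEP_FIELDS}


raw_reading = {
    "equipment_id": "B-1",
    "timestamp": "2001-02-06T12:00:00",
    "fuel_used": 1000.0,
    "steam_produced": 800.0,
    "routing_address": "192.0.2.10",  # transport detail, not needed in the warehouse
    "packet_header": "0x1f",          # likewise discarded
}

scrubbed = scrub(raw_reading)
```

Filtering against a list of wanted fields, rather than a list of unwanted ones, means that unexpected extra fields never reach the warehouse.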

[0024] The DTS 208, Data Warehouse mechanism 210 and OLAP 212 may be implemented using off-the-shelf software, hardware, firmware or a combination of them. Software and hardware for the DTS 208, Data Warehouse and OLAP 212 may be procured through well-known public vendors or original equipment manufacturers (OEM). For example, the data warehouse hardware and software is publicly available from Oracle Sequel 2000. The OLAP processes may presently be obtained from Knosis, Inc. of Boise, Id., e.g., its Knosis product, which may be launched using Microsoft Excel 2000. Other vendors include Applix, Inc. of Westboro, Mass., Brio Technology, Inc. of Palo Alto, Calif., and Business Objects of San Jose, Calif.

[0025] Next, a chart 214, in this case a best practices chart, is created based on the result of the on-line analytical processing 212. In FIG. 2, the best practices chart may plot efficiency versus firing rate for boilers from which data has been collected for two customers. Data collected for each of the two customers is plotted relative to an ideal best practice curve. As shown in the figure, the plot for customer 1 exceeds the ideal best practice at lower firing rates, but falls below it at higher rates. The plot for customer 2 lies below the ideal best practice curve for all firing rates.
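The comparison against an ideal best practice curve can be illustrated numerically. The sketch below assumes the curve is available as a few (firing rate, efficiency) points and uses linear interpolation between them; the numbers, the names and the choice of interpolation are all assumptions made for illustration, not values taken from FIG. 2.

```python
def interpolate(curve, x):
    """Linearly interpolate a curve given as (x, y) points sorted by x."""
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the range covered by the curve")


# Hypothetical ideal curve: (firing rate in percent, efficiency as a fraction).
BEST_PRACTICE = [(20.0, 0.80), (60.0, 0.86), (100.0, 0.84)]


def deviation_from_best_practice(readings, curve=BEST_PRACTICE):
    """Return (firing rate, measured efficiency minus ideal efficiency) pairs."""
    return [(rate, eff - interpolate(curve, rate)) for rate, eff in readings]


# Hypothetical customer: above the curve at low firing rates, below it at high ones.
customer_1 = [(20.0, 0.83), (60.0, 0.86), (100.0, 0.80)]
deviations = deviation_from_best_practice(customer_1)
```

A positive deviation means the customer's equipment beats the ideal curve at that firing rate; a negative one means it falls short.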

[0026] The industrial process best practice analysis of the present invention may be instantaneous as well as continuous, since the data from the various remote sites may be scanned and updated to the data market for analysis virtually immediately. Any combination of factors can be selected to determine the choice of data, the characteristics of the data to be presented and the manner in which the data is presented to the client, i.e., user. The present invention provides confidentiality to the users because the users access the best practices information through the Internet. It is a simple matter to maintain confidentiality of the user located at the far end.

[0027] The best practices chart and the OLAP 212 processing will now be described in more detail with respect to the diagram 300 shown in FIG. 3. A boiler 302, for example, is equipped with sensors to detect operating parameters such as the total fuel used or the total steam produced. That information is uploaded, via the communication network, to the data warehouse. The information is “sliced” or otherwise processed, by the scanner 206, DTS 208, or other means and is stored in the data warehouse 304. In this instance, the data warehouse 304 is logically organized according to the arrangement shown in FIG. 3, wherein the data is collated according to equipment number 304 a, i.e., boiler number, and one or more parameters relating to the respective equipment, i.e., total fuel 304 b or total steam 304 c. The known efficiency equation that relates steam to fuel is used to generate the efficiency number 304 d that is stored, as well, in the data warehouse 304. Other equations or relationships could as easily be used. In addition, the data warehouse 304 may be organized as multi-dimensional abstract space, the dimensions of which are defined by two or more variables. In the example shown in FIG. 3, the data warehouse may be arranged according to a 3-dimensional space 304 e. In another example, a multi-dimensional space may comprise dimensions corresponding to time, location and equipment parameters. Arranged in this manner, the OLAP 212 sorts and processes the data according to one or more of the dimensions.
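A short Python sketch may help make the warehouse layout concrete. It assumes, purely for illustration, that efficiency is the ratio of total steam to total fuel with both expressed in the same energy units; the specification refers only to "the known efficiency equation" without stating it. The table rows, the cube keys and the `roll_up` helper are likewise hypothetical.

```python
from collections import defaultdict


def efficiency(total_steam, total_fuel):
    """Assumed relation: steam output divided by fuel input, in the same energy units."""
    if total_fuel <= 0:
        raise ValueError("total_fuel must be positive")
    return total_steam / total_fuel


# Rows in the spirit of the FIG. 3 table: (boiler number, total fuel, total steam).
rows = [(1, 1000.0, 800.0), (2, 1200.0, 900.0)]
table = [
    {"boiler": b, "total_fuel": f, "total_steam": s, "efficiency": efficiency(s, f)}
    for b, f, s in rows
]

# A three-dimensional arrangement keyed by (time, location, boiler number).
cube = {
    ("2001-01", "north", 1): 0.80,
    ("2001-01", "south", 2): 0.75,
    ("2001-02", "north", 1): 0.82,
}


def roll_up(cube, axis):
    """Average the stored values for each distinct coordinate along one dimension."""
    groups = defaultdict(list)
    for key, value in cube.items():
        groups[key[axis]].append(value)
    return {coord: sum(vals) / len(vals) for coord, vals in groups.items()}


by_location = roll_up(cube, axis=1)  # mean efficiency per location
```

Rolling up along a different axis (0 for time, 2 for boiler number) gives the same data viewed from another dimension, which is the kind of sorting the OLAP 212 is described as performing.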

[0028] As shown in the figure, OLAP 212 may process the data in the warehouse (or data mart) 304, thereby providing a real time analysis. In the figure, multiple types of OLAP processing charts 306 a to 306 c are shown. Each of the charts represents a different efficiency response curve for respective pieces of equipment. For example, chart 306 a illustrates a typical response curve, wherein the equipment functions with a predicted and continuous efficiency over a range. Chart 306 b illustrates an atypical problem scenario, wherein the plotted equipment efficiency data briefly spikes at a particular point as a result of an equipment problem. Chart 306 c illustrates a situation in which the equipment operates nearly constantly over a period of time. OLAP 212 can arrange and manipulate the data to allow the user to flexibly view the data from particular points of view. For example, the OLAP 212 can arrange the data according to geographic region, allowing for prediction of how processes might be run in relation to a particular feature common to the region, such as temperature or altitude. Further, the OLAP 212 can provide instantaneous snapshots of the functioning of the equipment at any time. The user is able to view current equipment performance, or can evaluate past trends in performance.
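The problem scenario of chart 306 b, a brief spike in otherwise steady efficiency data, could be flagged automatically. The following is one simple way to do so, assuming a fixed tolerance around the series median; the threshold, the sample values and the function name are illustrative assumptions, and the specification does not prescribe any particular detection method.

```python
from statistics import median


def find_spikes(series, threshold=0.05):
    """Return the indices whose value differs from the series median by more than threshold."""
    mid = median(series)
    return [i for i, value in enumerate(series) if abs(value - mid) > threshold]


# Hypothetical efficiency readings with one brief spike at index 3.
efficiencies = [0.80, 0.81, 0.80, 0.95, 0.80, 0.79]
spike_indices = find_spikes(efficiencies)
```

The median is used instead of the mean so that the spike itself has little influence on the baseline it is compared against.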

[0029] Another aspect of the present invention relates to a business method and the process and system for implementing it. The method provides the industrial data mining tools in an on-line forum that also provides industrial applications, products and services. In one embodiment, the data mining services are provided at a discount or free of charge to attract and retain customers that visit the on-line service in order to cross-sell other industrial products and services. In another embodiment, subscribers that pay for data mining capability are enticed with other industrial products and services at a discount, as well as other industrial-related content.

[0030] As shown in FIG. 4, business plan 400 employs a web site 402 that supports interactive use by clients 1 . . . n (404) 404 a, b and c. The forum site provides customers with applications 406, services 408 and/or products 410. The site is designed to be interactive. That is, the client is provided with the capability of providing data to the web site. Such data may include news and/or articles 412 relevant to the industrial market. The data may also include services 414 provided by the client to either the web site or to other clients. Further, the data may be equipment data 416 uploaded to the web site from the remote site to populate the data warehouse.

[0031] In one aspect of the business plan, the clients are locked in as long term customers by providing them with one or more of the applications 406, services 408 or products 410 as free on-line offerings, or at a dramatically reduced cost, in exchange for a long-term, fee-based membership to the industrial products and services on-line offering. The on-line offering generates revenue by charging a subscription fee in return for the right to be members of the web site. Alternatively, or in addition, use of one or more of the applications, products or services may incur a charge. The applications, products or services may be provided on a surcharge or commission basis. For example, use of services provided by a client may incur a commission fee in return for providing the forum in which the clients meet and arrange to agree upon the thing to be exchanged.

[0032] In another aspect of the business plan, the clients are provided free data mining services. This may include free access to the OLAP resource 212. That is, the client will be able to access the data warehouse information and employ the OLAP resource 212 to slice the data in any manner that the client desires. This is done, for example, to obtain best practice information for industrial equipment, such as boilers. In return for such services, the client may pay a subscription fee for the on-line forum. In the alternative, the web site may provide such free services in order to entice users to view and/or visit advertisements and/or web sites relating to the content of the web site, i.e., industrial applications.

[0033] In other words, the business model may be characterized as customer driven: the customer provides the information and, indeed, may provide it to other customers. This differs from the traditional method of providing manual consulting services, in which a vendor provides and mines the data for the client. The business model of the present invention may include other features as well. A proactive business model, i.e., customer or service driven, may provide periodic reports to the user. These reports may be in the form of weekly or monthly reports. The periodic reports may have a predetermined format designated by the user. The report may be formatted to best communicate the information to the client according to the user's particular industrial operation.

[0034] Another aspect of the business model is to provide a performance guarantee service. In other words, in order to incentivize the purchasing of products and services provided over the web site, the business model offers a guarantee on the performance of such product or service. According to a further implementation of the business model, the incentivization program may be provided as part of the subscription in order to attract users to the web site and entice them to become long term members. In the alternative, users may purchase the guarantee on a per-item basis.

[0035] Another feature of the business plan is to provide inventory control for the user. In more detail, the user is provided with software applications on-line that provide optimized maintenance, deliveries, etc. For example, the inventory control allows the user to ensure that the optimal amount of equipment is supplied to the user remote site at any instant in time. This encourages long-term use of the website, because a user will come to depend on the optimal inventory control.

[0036] Another aspect of the business plan is to provide preventative maintenance control in order to incentivize long-term subscription to the web site. The web site provides reports and predictions of equipment failure based on the data analysis performed by the OLAP 110 on the data in the data mart. Further, the model may include a crisis response, such as alarm and/or dial out capability for automatically alerting the user to an actual or potential crisis identified automatically by the OLAP 110, sensors, etc. In addition, there may be provided the service of reporting historical operations of the user's equipment or operation. Such a service is useful from a business point of view because it addresses the user's business need to perform such reporting, as is normally performed by quality control departments.
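The automatic alerting described above can be sketched as a simple threshold check with a pluggable notification channel. Everything here is an assumption for illustration: the efficiency floor, the message format and the `notify` callback stand in for whatever alarm or dial out mechanism an implementation would actually use.

```python
def raise_alarms(latest_efficiency, floor, notify):
    """Call notify for each piece of equipment whose latest efficiency is below floor.

    latest_efficiency maps an equipment id to its most recent efficiency value.
    Returns the list of equipment ids that triggered an alarm.
    """
    alarmed = []
    for equipment_id, eff in sorted(latest_efficiency.items()):
        if eff < floor:
            notify(f"ALARM {equipment_id}: efficiency {eff:.2f} below {floor:.2f}")
            alarmed.append(equipment_id)
    return alarmed


# Collect messages in a list here; a real system might dial out or page the user.
messages = []
alarmed = raise_alarms({"B-1": 0.82, "B-2": 0.64}, floor=0.70, notify=messages.append)
```

Passing the notification channel in as a function keeps the detection rule separate from the delivery mechanism, so either can be replaced independently.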

[0037] Each of the above-described features of the business model is a source of revenue and can create an incentive for a user to retain the subscription to the web site for a relatively long period of time. Further, it is by design that the business plan encourages the user to return to the site frequently to monitor the user's operations and to lock the user in so that the user need not seek any other source for product or service, particularly relating to the industrial field. In essence, the business plan of the present invention provides a one-stop shopping forum that satisfies all or nearly all of the user's industrial application and product needs.

[0038] In addition to the embodiments of the aspects of the present invention described above, those of skill in the art will be able to arrive at a variety of other arrangements and steps which, if not explicitly described in this document, nevertheless embody the principles of the invention and fall within the scope of the appended claims.

I claim:
 1. A method for delivering industrial control-related on-line services to a user during an on-line session, the method comprising the steps of: providing access by a user over a network to a data mining system during the on-line session, the data mining system comprising at least one application in communication with a data warehouse, the data warehouse comprising data collected from among network-delivered data originating with a plurality of industrial control systems, the application allowing the user to conduct analyses of data in the data warehouse and to view the results of the analyses; and providing access by the user to non-data mining industrial control-related content, the access provided during the same on-line session.
 2. The method according to claim 1, wherein the step of providing provides said on-line site on the internet.
 3. The method according to claim 1, further comprising the step of data mining said data warehouse.
 4. The method according to claim 3, wherein the step of data mining data mines efficiency of plant equipment.
 5. The method according to claim 4, wherein the step of data mining generates a best practices chart from the data regarding the industrial application.
 6. The method according to claim 5, wherein the industrial application relates to the efficiency of an industrial boiler.
 7. The method according to claim 1, further comprising the step of segregating the data warehouse into one or more data marts, each data mart having applicability with a different industrial application.
 8. The method according to claim 7, further comprising the step of offering to the user data mining access to one or more of said data marts.
 9. The method according to claim 1, further comprising the step of providing OLAP as the data mining application.
 10. The method according to claim 7, wherein the step of segregating the data marts are segregated according to a cross-section of equipment having similar characteristics.
 11. The method according to claim 10, wherein the step of segregating segregates according to isothermal maps.
 12. The method according to claim 10, wherein the step of segregating segregates according to dissimilar features.
 13. The method according to claim 10, wherein the step of segregating further comprises the step of uploading the parameters by which the data marts are segregated.
 14. The method according to claim 3, further comprising the step of continuously scanning the clients industrial processes to compile the data warehouse.
 15. The method according to claim 14, further comprising the step of reporting the data mining results in real time to the client.
 16. The method according to claim 14, further comprising the step of scrubbing the data when scanned by eliminating header data.
 17. The method according to claim 5, wherein the step of generating a best practices chart plots an efficiency curve.
 18. The method according to claim 5, further comprising the step of taking a snap shot of the best practices of a particular industrial application at any moment in time.
 19. The method according to claim 18, further comprising the step of taking several snap shots of the best practices for a plurality of moments in time for said particular industrial application.
 20. The method according to claim 19, further comprising the step of generating a report based on a study of the best practices across the plurality of moments of time to provide the user with trends information.